Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
26133 | 1 | 22,106,049 |
26217 | 1 | 29,600,000 |
26267 | 1 | 30,5 |
26655 | 1 | 8,128,940 |
26683 | 1 | 85,7 |
37321 | 1 | تهران,در |
45392 | 1 | شد,گفت |
49492 | 1 | كرديم,هر |
53574 | 1 | ميشود,كه |
56894 | 1 | هفتم, |

Words containing (:

Rank in Wordlist | Frequency | Word |
---|---|---|
4429 | 17 | امام(ره |
8435 | 7 | خمینی(ره |
9363 | 6 | حسین(ع |
11940 | 4 | ايران(ايسنا)، |
12383 | 4 | خمينی(ره |
13642 | 4 | پيامبر(ص |
15130 | 3 | خميني(ره |
15405 | 3 | رضا(ع |
15886 | 3 | علی(ع |
16440 | 3 | معصومه(س |

Words containing ):

Rank in Wordlist | Frequency | Word |
---|---|---|
1992 | 47 | ايسنا)، |
11940 | 4 | ايران(ايسنا)، |
12813 | 4 | ع)، |
14858 | 3 | تن)، |
17565 | 2 | 04:00)به |
17566 | 2 | 04:00)جایگاه |
17567 | 2 | 04:00)روز |
25613 | 2 | ۵)، |
25725 | 1 | 04:00)اتحادیه |
25726 | 1 | 04:00)در |

Words containing %:

Rank in Wordlist | Frequency | Word |
---|---|---|
25713 | 1 | %شده |
26085 | 1 | 20% |
58742 | 1 | ٢١% |
63055 | 1 | ۲۰% |
63322 | 1 | ۵۰% |
63417 | 1 | ۷۰% |
63473 | 1 | ۸۵% |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
26767 | 1 | AT&T |
27044 | 1 | R&D |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
23854 | 2 | نداریم'، |
26506 | 1 | 59220'، |
27245 | 1 | «Moody's |
27246 | 1 | «Moody's» |
30524 | 1 | است'، |
35274 | 1 | بوده'، |
38050 | 1 | جزایر'نیکوبار |
38443 | 1 | جوزاني'، |
41240 | 1 | دهم'، |
51816 | 1 | مسکن'، |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
3827 | 21 | http://mag |
6462 | 10 | http://news |
7820 | 8 | شد/ |
13904 | 3 | 5/1 |
17626 | 2 | 2/2 |
17655 | 2 | 3/3 |
17678 | 2 | 5/13 |
17679 | 2 | 5/4 |
17688 | 2 | 6/1 |
17976 | 2 | آسیا/ |
In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words, while others are not. If words containing forbidden characters occur with more than very low frequency, this may indicate a problem in preprocessing.
Example query for words containing %:
select w_id-100, freq, word from words where w_id>100 and word like "%\%%" limit 10;
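The same check can be repeated for every character in the list above. The following is only a minimal sketch, assuming a table `words(w_id, word, freq)` as used in the query; it runs against an in-memory SQLite database rather than the original database, so the `escape '\'` clause is written out explicitly (SQLite has no default escape character). The helper names, the `order by w_id` clause, the frequency threshold and the demo rows are hypothetical and serve illustration only.

```python
import sqlite3

# Characters checked in this subsection (taken from the list in the text).
SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]


def like_pattern(ch):
    """Return a LIKE pattern for words containing ch.

    % and _ are LIKE wildcards, so they (and the escape character itself)
    are prefixed with a backslash to be matched literally.
    """
    if ch in ("%", "_", "\\"):
        ch = "\\" + ch
    return "%" + ch + "%"


def words_with_char(conn, ch, limit=10):
    """Return (rank, freq, word) rows for words containing ch.

    Mirrors the query in the text; 'order by w_id' is an added assumption
    so that the result order is deterministic.
    """
    sql = (
        "select w_id - 100, freq, word from words "
        "where w_id > 100 and word like ? escape '\\' "
        "order by w_id limit ?"
    )
    return conn.execute(sql, (like_pattern(ch), limit)).fetchall()


def frequent_hits(rows, max_freq):
    """Keep rows whose frequency exceeds max_freq (a hypothetical threshold)."""
    return [row for row in rows if row[1] > max_freq]


def build_demo(conn):
    """Create a tiny demo table; rows are illustrative, not the real word list."""
    conn.execute(
        "create table words (w_id integer primary key, word text, freq integer)"
    )
    conn.executemany(
        "insert into words values (?, ?, ?)",
        [
            (50, "%", 1000),      # w_id <= 100: skipped by the query
            (150, "word", 500),   # ordinary word without special characters
            (26185, "20%", 1),
            (26867, "AT&T", 1),
            (27144, "R&D", 1),
        ],
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_demo(conn)
    for ch in SPECIAL_CHARS:
        rows = words_with_char(conn, ch)
        if rows:
            print("Words containing " + ch + ":")
            for rank, freq, word in rows:
                print(rank, "|", freq, "|", word)
```

Escaping matters most for `%` and `_`: without it, a pattern built from `_` would match every word of at least one character. Passing the result of `words_with_char` to `frequent_hits` with a small threshold gives a rough list of candidates that may point to preprocessing problems.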
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots